Approximate Privacy-Preserving Data Mining on Vertically Partitioned Data
Abstract
As more of the world goes digital, data privacy has become increasingly important. Researchers have developed many privacy-preserving technologies, particularly for data mining and data sharing. These technologies can compute exact data mining models from private data without revealing that data, but they are generally slow. We therefore present a framework for implementing efficient, privacy-preserving, secure approximations of data mining tasks. In particular, we implement two sketching protocols for the scalar (dot) product of two vectors, which can be used as sub-protocols in larger data mining tasks. These protocols can yield approximations with high accuracy, low data leakage, and a one to two orders of magnitude improvement in efficiency. We demonstrate these accuracy and efficiency results through extensive experimentation. We also analyze the security properties of these approximations under a security definition that, in contrast to previous definitions, allows for very efficient approximation protocols.
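To make the idea of a dot-product sketch concrete, here is a minimal illustration in Python. It is not one of the two protocols from the paper, whose details are not given in this abstract. It assumes a generic random-sign projection: both parties derive the same sign pattern from a shared seed, each compresses its own vector to k numbers, and the scalar product is estimated from the two short sketches. The names sketch, approx_dot, k, and seed are hypothetical, and the code makes no security claim.

```python
import random


def sketch(vector, k, seed):
    """Compress `vector` to k numbers using random +/-1 signs.

    Assumption: both parties agree on `seed` and use vectors of equal
    length, so they draw the same sign pattern independently.
    """
    rng = random.Random(seed)
    # Each output entry is the inner product of the vector with one
    # pseudo-random sign vector.
    return [sum(rng.choice((-1.0, 1.0)) * v for v in vector) for _ in range(k)]


def approx_dot(sketch_a, sketch_b):
    """Estimate the dot product of the original vectors from two sketches."""
    k = len(sketch_a)
    # Averaging the k products gives an unbiased estimate of <a, b>.
    return sum(a * b for a, b in zip(sketch_a, sketch_b)) / k


if __name__ == "__main__":
    gen = random.Random(0)
    x = [gen.random() for _ in range(1000)]  # party A's private vector
    y = [gen.random() for _ in range(1000)]  # party B's private vector
    exact = sum(a * b for a, b in zip(x, y))
    estimate = approx_dot(sketch(x, 200, seed=42), sketch(y, 200, seed=42))
    print("exact:", exact, "estimate:", estimate)
```

In this sketch each party shares k numbers instead of a full vector, and a larger k lowers the variance of the estimate at the price of more computation and communication. Whether exchanging such sketches leaks an acceptable amount of information is the kind of question the security analysis described in the abstract addresses.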
Similar resources
Privacy Preserving Naïve Bayes Classifier for Vertically Partitioned Data
Privacy-Preserving Data Mining – developing models without seeing the data – is receiving growing attention. This paper assumes a privacy-preserving distributed data mining scenario: data sources collaborate to develop a global model, but must not disclose their data to others. Naïve Bayes is often used as a baseline classifier, consistently providing reasonable classification performance. This...
Privacy Preserving ID3 over Horizontally, Vertically and Grid Partitioned Data
We consider privacy preserving decision tree induction via ID3 in the case where the training data is horizontally or vertically distributed. Furthermore, we consider the same problem in the case where the data is both horizontally and vertically distributed, a situation we refer to as grid partitioned data. We give an algorithm for privacy preserving ID3 over horizontally partitioned data invo...
Privacy Preserving CART Algorithm over Vertically Partitioned Data
Data mining classification algorithms are centralized algorithms and work on centralized databases. In this information age, organizations use distributed databases. Since data mining of private data is one of the keys to success for an organization, it is a challenging task to implement data mining in a distributed database. Collaboration between different organizations brings mutual benefits to the pa...
Privacy Preserving Association Rule Mining in Vertically Partitioned Data
Data mining technology has emerged as a means for identifying patterns and trends from large quantities of data. This paper presents privacy preserving association rule mining across vertically partitioned data. We present an efficient algorithm to discover association rules with minimum levels of support and confidence, from heterogeneous data distributed across 2 parties, while preventing eit...
Privacy Preserving Data Mining over Vertically Partitioned Data
Vaidya, Jaideep Shrikant. Ph.D., Purdue University, August, 2004. Privacy Preserving Data Mining over Vertically Partitioned Data. Major Professor: Chris Clifton. The goal of data mining is to extract or “mine” knowledge from large amounts of data. However, data is often collected by several different sites. Privacy, legal and commercial concerns restrict centralized access to this data. Theore...
Publication date: 2012